Tutorial Part VI — Population analysis in R

Introduction

This is a brief tutorial about constructing and analyzing relatively simple Cormack-Jolly-Seber capture-mark-recapture (CMR) models in R. For this we use the R package “marked” (Laake et al. 2013).

First we get an overview of the data, then we develop different models and finally we analyse and visualize the results.

Setup

Load (and install) the following packages:

package.list=c("here",
               "marked",
               "skimr",
               "sf",
               "tmap",
               "devtools",
               "rnaturalearthdata", 
               "ggplot2",
               "readr",
               "tidyr")

for (package in package.list) {
  if (!require(package, character.only=T, quietly=T)) {
    install.packages(package) ##take out for knitting
    library(package, character.only=T)
  }
}
remotes::install_github("EcoDynIZW/d6berlin") ##take out for knitting
library("d6berlin")
# set a seed, i.e. we create the same random numbers
set.seed(123)

set the theme for some ggplots:

theme_set(
  theme(
    panel.grid = element_blank(),
    panel.background = element_rect(fill = "white"),
    panel.border = element_rect(
      colour = "black",
      fill = NA,
      linewidth = 1
    ),
    axis.ticks = element_line(colour = "black", linewidth = 1),
    axis.ticks.length = unit(0.2, "cm")
  )
)

Load and manipulate data

Here, we want to directly set some columns as factors for all subsequent analyses:

starlings <- readr::read_csv(here("data", "data_berlin", "animal_data", "starlings.csv"))

starlings$sex <- as.factor(starlings$sex) # convert to factor
starlings$habitat <- as.factor(starlings$habitat) # convert to factor
starlings$infected <- as.factor(starlings$infected) # convert to factor
names(starlings)[1] <- "ID" # rename first column to ID column

# Short overview
skim(starlings)
Data summary
Name starlings
Number of rows 605
Number of columns 5
_______________________
Column type frequency:
character 1
factor 3
numeric 1
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ch 0 1 7 7 0 32 0

Variable type: factor

skim_variable n_missing complete_rate ordered n_unique top_counts
sex 0 1.00 FALSE 2 Fem: 323, Mal: 282
habitat 0 1.00 FALSE 2 urb: 311, rur: 294
infected 202 0.67 FALSE 2 Yes: 214, No: 189

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
ID 0 1 303 174.79 1 152 303 454 605 ▇▇▇▇▇

Each row in this table represents one individual. We have 5 columns in this dataset: ID, the individual identifier, ch, the capture history (0 = undetected, 1 = detected),the sex of the captured individuals, the habitat type the birds were captured, and health status, i.e. if birds were infected

Explore capture locations

#load environmental data on capturing
locations <-
  read.csv(here(
    "data",
    "data_berlin",
    "animal_data",
    "starlings_capture_location_data.csv"
  ))

# Short overview
skim(locations)
Data summary
Name locations
Number of rows 2
Number of columns 10
_______________________
Column type frequency:
character 1
numeric 9
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
Habitat 0 1 5 5 0 2 0

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
Lon 0 1 13.57 0.19 13.44 13.51 13.57 13.64 13.71 ▇▁▁▁▇
Lat 0 1 52.44 0.05 52.40 52.42 52.44 52.45 52.47 ▇▁▁▁▇
temperature_2016 0 1 30.25 1.06 29.50 29.88 30.25 30.62 31.00 ▇▁▁▁▇
temperature_2017 0 1 39.00 5.66 35.00 37.00 39.00 41.00 43.00 ▇▁▁▁▇
temperature_2018 0 1 33.50 3.54 31.00 32.25 33.50 34.75 36.00 ▇▁▁▁▇
temperature_2019 0 1 28.50 2.12 27.00 27.75 28.50 29.25 30.00 ▇▁▁▁▇
temperature_2020 0 1 28.50 0.71 28.00 28.25 28.50 28.75 29.00 ▇▁▁▁▇
temperature_2021 0 1 30.00 1.41 29.00 29.50 30.00 30.50 31.00 ▇▁▁▁▇
temperature_2022 0 1 29.50 1.41 28.50 29.00 29.50 30.00 30.50 ▇▁▁▁▇
#transform our locations into a spatial sf-object
loc_sf <- st_as_sf(locations, coords = c("Lon", "Lat"), crs = 4326)

#plot the locations
d6berlin::base_map_imp(globe = TRUE, resolution = 500)+
  geom_sf(data=st_transform(loc_sf, crs=3035),aes(col=Habitat),size=5)+
  guides(color = guide_legend(direction = "horizontal",
                              title.position = "top", 
                              title.hjust = .5))

# or interactive plot:

# tmap_mode("view")
# tm_shape(loc_sf) + tm_dots(col = "Habitat", size = 2)

Plot the capturing data

# Capture histories

ggplot(data = starlings, mapping = aes(x = ch)) +
  geom_bar(colour = "#262674", fill = "#262674") +
  scale_y_continuous(
    limits = c(0, 40),
    breaks = seq(0, 40, by = 10),
    expand = c(.001, .001)
  ) +
  labs(x = "capture histories") +
  theme(axis.text.x = element_text(angle = 45))

# Sex
ggplot(data = starlings,mapping = aes(x = sex, fill = sex)) +
  geom_bar() +
  scale_fill_manual(values = c(Female = "salmon", Male = "lightblue"))

# Sex and capture histories
ggplot(data = starlings, mapping = aes(x = ch, fill = sex)) +
  geom_bar(position = "stack") +
  scale_y_continuous(
    limits = c(0, 40),
    breaks = seq(0, 40, by = 10),
    expand = c(.001, .001)
  ) +
  labs(x = "capture histories", fill = "sex") +
  theme(axis.text.x = element_text(angle = 45)) +
  scale_fill_manual(values = c(Female = "salmon", Male = "lightblue"))

Plot environmental data

#data can be in wide or long format, often long-format is preferred
locations_long <- pivot_longer(
  data = locations,
  names_to = "year",
  values_to = "temp",
  cols = starts_with("Temp"))


# check max. temperatures between habitats

ggplot(locations_long, aes(x = year, y = temp, group = Habitat)) +
  geom_line(aes(col = Habitat), linewidth = 2) +
  scale_colour_manual(values = c(urban = "darkgoldenrod", rural = "darkblue"))+
  theme(axis.text.x = element_text(angle = 45,hjust=0.5,vjust=0.5))

Basic Cormack-Jolly-Seber models

The default model uses constant detection and survival probabilities and constant time intervals between captures. We fit the model and have a look at the results. Please note: The estimates are on a logit scale

#models require data frames, tibbles produce errors.
starlings <- as.data.frame(starlings)
#apply basic crm model
cjs_m1 <- crm(data = starlings)
#take a look
cjs_m1
## 
## crm Model Summary
## 
## Npar :  2
## -2lnL:  1291.825
## AIC  :  1295.825
## 
## Beta
##                   Estimate
## Phi.(Intercept) 0.06757843
## p.(Intercept)   2.22899510

Npar = number of parameters -2lnL = 2 log(Likelihood) of the fitted model AIC = Akaikes Information Criterion Phi = Survival probability between capture events p = capture probability per capture event

Estimate parameters on real scale

Write a function to assess results more intuitively on the real scale instead of the logit-scale. We can just paste our results from the previous model output

# write function
inverse_logit <- function(x) {
  exp(x) / (1 + exp(x))
  
}

#e.g.
inverse_logit(0.06757844) # this is our survival probability
## [1] 0.5168882
inverse_logit(2.22899510) # this is our capture probability
## [1] 0.9028232
#  alternative way to write this

inverse_logit(cjs_m1$results$beta$Phi)
## (Intercept) 
##   0.5168882
inverse_logit(cjs_m1$results$beta$p)
## (Intercept) 
##   0.9028232

Refit the model with confidence intervals

cjs_m1_2 <- cjs.hessian(cjs_m1)
cjs_m1_2
## 
## crm Model Summary
## 
## Npar :  2
## -2lnL:  1291.825
## AIC  :  1295.825
## 
## Beta
##                   Estimate         se         lcl       ucl
## Phi.(Intercept) 0.06757843 0.07249005 -0.07450207 0.2096589
## p.(Intercept)   2.22899510 0.24951951  1.73993685 2.7180533

Predict

Now we transform the logit estimates to real estimates by using the inverse logit

predict(object = cjs_m1_2,
        newdata = data.frame(sex = c("Female", "Male")),
        se = TRUE)
## $Phi
##   occ  estimate         se       lcl       ucl
## 1   1 0.5168882 0.01810184 0.4813831 0.5522236
## 
## $p
##   occ  estimate         se      lcl       ucl
## 1   2 0.9028232 0.02189121 0.850679 0.9380836

What does that mean? The survival probability between capture events was 0.52. The probability to capture an individual during a capture event was 0.90. Let’s visualize this in a plot:

# store prediction as object
mypred_list <- predict(object = cjs_m1_2,
            newdata = data.frame(sex = c("Female", "Male")),
            se = TRUE)
#mypred_list ## this is a list. Convert into data.frame:
mypred_df <- do.call(rbind, mypred_list)
## add the rownames phi and p as column
mypred_df$prob <- rownames(mypred_df)


ggplot(data=mypred_df, aes(x=prob, y=estimate, color = prob)) + 
      geom_point(size = 5) +
      geom_errorbar(aes(ymin = lcl, ymax = ucl, col = prob),width =  0.25) +
      ggtitle("detection (p) and survival prob. (phi)") + 
      xlab("result") + ylab("estimate") +
      theme_bw(base_size = 20) 

## Tip for great plots:  
##https://github.com/CerrenRichards/ggplot2-for-publications

As mentioned before, we assume constant time intervals between capturing events. In reality, this is often not the case, e.g. capturing events have to be postponed due to weather conditions. Now we build a model where we take this into account.

CJS with irregular sampling intervals

cjs_m1_time <- crm(data = starlings,
                   # time.interval vector is the interval between each capture event.
                   # for 7 capture events, there are 6 intervals
                   time.intervals = c(1, 1, 1, 1.6, 2, 3))

# predict model
predict(object = cjs_m1_time)
## $Phi
##   occ estimate
## 1   6 0.688108
## 
## $p
##   occ  estimate
## 1   7 0.8858232

Now we have a higher survival probability of 0.69 between capturing events and a lower capture probability per capturing event of 0.89

Exercise

We have sampled two different populations in urban and rural habitat. Split the data set (Hint: use [] or dplyr::filter()) and compare survival probabilities between the two habitats

# your code in here

Fit multiple models simultaneously

Normally, there are different hypotheses about the factors that influence the survival probability of individuals. These can be characteristics of the individuals, such as sex, age or weight, or characteristics of the environment of the individuals, such as sealing, temperature or the number of nesting opportunities. For this purpose, different models are built that take the respective factors (explanatory or predictor variables) into account.

For a flexible approach we process the data, create design data to add variables that depend on time and fit the different models.

Group data

We process the data and define the grouping variables

m2_proc <- process.data(data = starlings,groups = c("sex","habitat","infected"))

head(m2_proc[[1]])
##    ID            ch    sex habitat infected freq id group initial.age
## 40 40 0,0,0,0,0,1,0 Female   rural      Yes    1  1     5           0
## 41 41 0,0,0,0,0,1,0 Female   rural      Yes    1  2     5           0
## 42 42 0,0,0,0,0,1,0 Female   rural       No    1  3     1           0
## 43 43 0,0,0,0,0,1,0 Female   rural      Yes    1  4     5           0
## 44 44 0,0,0,0,0,1,0 Female   rural       No    1  5     1           0
## 45 45 0,0,0,0,0,1,0 Female   rural      Yes    1  6     5           0

Create the design data

m2_design <- make.design.data(data = m2_proc)

head(m2_design[[1]])
##   id occ time cohort age ID    sex habitat infected freq          group Time
## 1  1   1    1      6   0 40 Female   rural      Yes    1 FemaleruralYes    0
## 2  1   2    2      6   0 40 Female   rural      Yes    1 FemaleruralYes    1
## 3  1   3    3      6   0 40 Female   rural      Yes    1 FemaleruralYes    2
## 4  1   4    4      6   0 40 Female   rural      Yes    1 FemaleruralYes    3
## 5  1   5    5      6   0 40 Female   rural      Yes    1 FemaleruralYes    4
## 6  1   6    6      6   0 40 Female   rural      Yes    1 FemaleruralYes    5
##   Cohort Age order
## 1      5   0     1
## 2      5   0     2
## 3      5   0     3
## 4      5   0     4
## 5      5   0     5
## 6      5   0     6

Model formulas

Build formulas for each parameter

phi_dot <- list(formula = ~ 1)
phi_sex <- list(formula = ~ sex)
p_sex <- list(formula = ~ sex)

M2

Fit a model based on the design data with constant survival and different detection probabilities between sexes.

cjs_m2 <- crm(
  m2_proc,
  m2_design,
  model.parameters = list(Phi = phi_dot,
                          p = p_sex)
)
##  Number of evaluations:  100  -2lnl: 1289.126943
cjs_m2
## 
## crm Model Summary
## 
## Npar :  3
## -2lnL:  1289.117
## AIC  :  1295.117
## 
## Beta
##                  Estimate
## Phi.(Intercept) 0.0793894
## p.(Intercept)   1.7877066
## p.sexMale       0.7906268
predict(object = cjs_m2)
## $Phi
##   occ  estimate
## 1   6 0.5198369
## 
## $p
##      sex occ  estimate
## 1 Female   7 0.8566459
## 2   Male   7 0.9294541

M3

Fit another model based on the design data with different survival and detection probabilities between sexes

cjs_m3 <- crm(
  m2_proc,
  m2_design,
  model.parameters = list(Phi = phi_sex,
                          p = p_sex),
  accumulate = FALSE
)
##  Number of evaluations:  100  -2lnl: 1283.680512 Number of evaluations:  200  -2lnl: 1283.680512
cjs_m3
## 
## crm Model Summary
## 
## Npar :  4
## -2lnL:  1283.681
## AIC  :  1291.681
## 
## Beta
##                    Estimate
## Phi.(Intercept) -0.09718597
## Phi.sexMale      0.34094477
## p.(Intercept)    1.98842948
## p.sexMale        0.46984162
predict(object = cjs_m3)
## $Phi
##      sex occ  estimate
## 1 Female   6 0.4757226
## 2   Male   6 0.5606397
## 
## $p
##      sex occ  estimate
## 1 Female   7 0.8795769
## 2   Male   7 0.9211642

M4

Fit another model based on the design data with time-dependent survival and detection probability time is a variable that was introduced by the make.design.data() function

cjs_m4 <- crm(
  m2_proc,
  m2_design,
  model.parameters = list(Phi = list(formula =  ~ time),
                          p = list(formula =  ~ time)),
  accumulate = FALSE
)
##  Number of evaluations:  100  -2lnl: 1244.670707 Number of evaluations:  200  -2lnl: 1239.725494 Number of evaluations:  300  -2lnl: 1239.319519 Number of evaluations:  400  -2lnl: 1239.303701 Number of evaluations:  500  -2lnl: 1239.314667 Number of evaluations:  600  -2lnl: 1239.332245 Number of evaluations:  700  -2lnl: 1240.362774 Number of evaluations:  800  -2lnl: 1239.305679 Number of evaluations:  900  -2lnl: 1239.798211 Number of evaluations:  1000  -2lnl:  1239.31019 Number of evaluations:  1100  -2lnl: 1239.303675
cjs_m4
## 
## crm Model Summary
## 
## Npar :  12
## -2lnL:  1239.304
## AIC  :  1263.304
## 
## Beta
##                   Estimate
## Phi.(Intercept)  1.5987971
## Phi.time2       -2.6225405
## Phi.time3       -1.8076489
## Phi.time4       -1.1463136
## Phi.time5       -1.3796043
## Phi.time6       -0.6965196
## p.(Intercept)    1.2003709
## p.time3          1.5729892
## p.time4          1.2288969
## p.time5          0.7978441
## p.time6          1.2524201
## p.time7         -0.2257743
predict(object = cjs_m4)
## $Phi
##   time occ  estimate
## 1    6   6 0.7114173
## 2    5   5 0.5545798
## 3    4   4 0.6112295
## 4    3   3 0.4479760
## 5    2   2 0.2642989
## 6    1   1 0.8318502
## 
## $p
##   time occ  estimate
## 1    7   7 0.7260348
## 2    6   6 0.9207653
## 3    5   5 0.8806095
## 4    4   4 0.9190321
## 5    3   3 0.9412192
## 6    2   2 0.7685908

Multiple models + model selection (AIC)

When testing multiple hypotheses, you can speed up model formulation for multiple models.

Caution! Models should always represent hypotheses according to biological meaningful questions, don’t just fit many models without thinking about them. Generally, formulate hypotheses before formulating models!

fit.starlings.cjs.models <- function() {
  # Apparent survival (Phi) formula
  Phi.time <- list(formula = ~ time)
  Phi.sex <- list(formula = ~ sex)
  Phi.sex.time <-
    list(formula = ~ sex * time) # interaction of sex and time
  Phi.habitat.time <- list(formula = ~ habitat * time)
  Phi.dot <- list(formula = ~ 1) # constant survival
  
  # Detection probability (p) formula
  p.sex <- list(formula = ~ sex)  # differs between males and females
  p.time <-
    list(formula = ~ time)  # one discrete estimate of p per capture event
  p.habitat <- list(formula = ~ habitat)
  p.dot <- list(formula = ~ 1) # constant detection
  
  # Construct all combinations and put into one model table
  cml <-
    create.model.list(c("Phi", "p")) # makes all possibile combinations of those parameter formulas
  results <- crm.wrapper(
    cml,
    data = m2_proc,
    ddl = m2_design,
    external = FALSE,
    accumulate = FALSE,
    hessian = TRUE
  )
  return(results)
}
# Run function
starlings_cjs_models <- fit.starlings.cjs.models()

Model fitting summary

starlings_cjs_models
##                              model npar      AIC  DeltaAIC       weight
## 13           Phi(~sex * time)p(~1)   13 1251.441  0.000000 2.983796e-01
## 15         Phi(~sex * time)p(~sex)   14 1252.538  1.096923 1.724152e-01
## 5        Phi(~habitat * time)p(~1)   13 1252.975  1.534076 1.385635e-01
## 14     Phi(~sex * time)p(~habitat)   14 1253.052  1.610556 1.333648e-01
## 7      Phi(~habitat * time)p(~sex)   14 1253.221  1.780076 1.225267e-01
## 6  Phi(~habitat * time)p(~habitat)   14 1254.767  3.325259 5.658460e-02
## 17                 Phi(~time)p(~1)    7 1256.353  4.911634 2.559891e-02
## 19               Phi(~time)p(~sex)    8 1256.478  5.036444 2.405023e-02
## 18           Phi(~time)p(~habitat)    8 1257.978  6.536567 1.135982e-02
## 16        Phi(~sex * time)p(~time)   18 1258.130  6.688597 1.052831e-02
## 8     Phi(~habitat * time)p(~time)   18 1259.310  7.868629 5.836038e-03
## 20              Phi(~time)p(~time)   12 1263.304 11.862290 7.923290e-04
## 9                   Phi(~sex)p(~1)    3 1290.564 39.122357 9.538000e-10
## 11                Phi(~sex)p(~sex)    4 1291.681 40.239128 5.456996e-10
## 10            Phi(~sex)p(~habitat)    4 1292.362 40.920736 3.881004e-10
## 12               Phi(~sex)p(~time)    8 1294.274 42.832830 1.491895e-10
## 3                   Phi(~1)p(~sex)    3 1295.117 43.675758 9.788110e-11
## 1                     Phi(~1)p(~1)    2 1295.825 44.383660 6.870364e-11
## 2               Phi(~1)p(~habitat)    3 1297.634 46.193096 2.780135e-11
## 4                  Phi(~1)p(~time)    7 1298.782 47.340491 1.566434e-11
##     neg2lnl convergence
## 13 1225.441           0
## 15 1224.538           0
## 5  1226.975           0
## 14 1225.052           0
## 7  1225.221           0
## 6  1226.767           0
## 17 1242.353           0
## 19 1240.478           0
## 18 1241.978           0
## 16 1222.130           0
## 8  1223.310           0
## 20 1239.304           0
## 9  1284.564           0
## 11 1283.681           0
## 10 1284.362           0
## 12 1278.274           0
## 3  1289.117           0
## 1  1291.825           0
## 2  1291.634           0
## 4  1284.782           0

Time-dependent variables in CJS models

Now, we have already learned how we can incorporate static individual covariates. But covariates can also vary between sampling occasions. Here, we are interested how a binary but time-varying variable heat-wave will impact survival and detection probabilities. First, we will create a binary variable. In addition, we will see how continuous individual covariates (here: weight) can be used to describe demographic parameters. Caution, here we just assume weight to be constant across an individuals’ life (of course it is not), but we can’t document the weight of individuals that are not captured. There are ways to incorporate missing data (e.g. imputation or assigning weight classes such as small, medium, large), but we will not cover them here.

# create variable
Heat_wave <- matrix(rep(c(0, 1, 1, 0, 0, 0), each = nrow(starlings)),
                    ncol =6)
# rename columns
colnames(Heat_wave) = paste("Heat_wave", 1:6, sep = "")

#bind rows
starlings_heatwave <- cbind(starlings, Heat_wave)

#generate some random weight 
starlings_heatwave$weight <- round(rnorm(nrow(starlings), 80, 3))


#prepare data
starlings.proc <- process.data(starlings_heatwave)

# Design matrix for survival
design.Phi <-
  list(static = c("weight", "sex"),
       time.varying = c("Heat_wave"))

# Design matrix for detection
design.p <- list(static = c("sex"))

# combine in one list
design.parameters <- list(Phi = design.Phi, p = design.p)

# processed data and design parameters into model design matrix
m3_design <- make.design.data(data = starlings.proc,
                              parameters = design.parameters)

#check the names in the design matrices
names(m3_design$Phi)
##  [1] "id"        "occ"       "time"      "cohort"    "age"       "Heat_wave"
##  [7] "weight"    "sex"       "Time"      "Cohort"    "Age"       "order"
names(m3_design$p)
##  [1] "id"     "occ"    "time"   "cohort" "age"    "sex"    "Time"   "Cohort"
##  [9] "Age"    "fix"    "order"
# Define survival prob.
Phi.heat.sex <- list(formula =  ~ Heat_wave * sex)

# Define detection prob.
p.sex <- list(formula =  ~ sex)

# fit the model
m3 <- crm(
  starlings.proc,
  m3_design,
  hessian = TRUE,
  model.parameters = list(Phi = Phi.heat.sex,
                          p = p.sex)
)
##  Number of evaluations:  100  -2lnl: 1244.121204 Number of evaluations:  200  -2lnl: 1243.974874 Number of evaluations:  300  -2lnl: 1244.044779 Number of evaluations:  100  -2lnl: 1245.196681
predict(m3)
## $Phi
##   Heat_wave    sex occ  estimate         se       lcl       ucl
## 1         0 Female   6 0.5888999 0.03365102 0.5217320 0.6529102
## 2         0   Male   6 0.6029780 0.03050136 0.5419426 0.6609683
## 3         1 Female   3 0.2874165 0.03769815 0.2194742 0.3665152
## 4         1   Male   3 0.4682230 0.04526839 0.3813970 0.5570169
## 
## $p
##      sex occ  estimate         se       lcl       ucl
## 1 Female   7 0.8722827 0.03817459 0.7772436 0.9304042
## 2   Male   7 0.9191637 0.02691341 0.8482694 0.9585520

More complex models & predicting

Here, we are interested in individual weight.

Phi.weight.heat <-list(formula=~weight*Heat_wave)
p.sex <-list(formula=~sex)


m4 <- crm(
  starlings.proc,
  m3_design,
  hessian = TRUE,
  model.parameters = list(Phi = Phi.weight.heat,
                          p = p.sex)
)
##  Number of evaluations:  100  -2lnl: 1252.677436 Number of evaluations:  200  -2lnl:  1255.89512 Number of evaluations:  300  -2lnl: 1252.968609 Number of evaluations:  400  -2lnl: 1252.676557 Number of evaluations:  100  -2lnl: 1257.437678
#predict model output
predict(m4)
## $Phi
##    weight Heat_wave occ  estimate         se       lcl       ucl
## 1      79         0   6 0.6003154 0.02366211 0.5531702 0.6456726
## 2      78         0   6 0.6030211 0.02649719 0.5501114 0.6536262
## 3      76         0   6 0.6084135 0.03577190 0.5365250 0.6758874
## 4      87         0   6 0.5784621 0.05581375 0.4670070 0.6824580
## 5      84         0   6 0.5866981 0.03700753 0.5127970 0.6568898
## 6      77         0   6 0.6057205 0.03070027 0.5442351 0.6640317
## 7      82         0   6 0.5921622 0.02713701 0.5380946 0.6440871
## 8      80         0   6 0.5976036 0.02274455 0.5523348 0.6412679
## 9      81         0   6 0.5948858 0.02399419 0.5471243 0.6409161
## 10     85         0   6 0.5839579 0.04294503 0.4981462 0.6649652
## 11     75         0   6 0.6110998 0.04137545 0.5276484 0.6885113
## 12     83         0   6 0.5894329 0.03163288 0.5263314 0.6497225
## 13     86         0   6 0.5812125 0.04925269 0.4827940 0.6735658
## 14     73         0   6 0.6164522 0.05343803 0.5078826 0.7145325
## 15     74         0   4 0.6137795 0.04730548 0.5180168 0.7014801
## 16     90         0   4 0.5701823 0.07641024 0.4186053 0.7096513
## 17     80         1   3 0.3691765 0.02984352 0.3128306 0.4293305
## 18     86         1   3 0.4125508 0.06882940 0.2869833 0.5506308
## 19     78         1   3 0.3551391 0.03739509 0.2856581 0.4313161
## 20     77         1   3 0.3482111 0.04407812 0.2674567 0.4387446
## 21     81         1   3 0.3762808 0.03081215 0.3180633 0.4383054
## 22     79         1   3 0.3621284 0.03230885 0.3014750 0.4275162
## 23     84         1   3 0.3979045 0.04992160 0.3052082 0.4985524
## 24     83         1   3 0.3906476 0.04173233 0.3125535 0.4747780
## 25     82         1   3 0.3834388 0.03507171 0.3174010 0.4540764
## 26     76         1   3 0.3413467 0.05162750 0.2483609 0.4483786
## 27     87         1   3 0.4199341 0.07901220 0.2771182 0.5775476
## 28     85         1   3 0.4052065 0.05906622 0.2964706 0.5241130
## 29     75         1   2 0.3345482 0.05961159 0.2292578 0.4593738
## 30     88         0   6 0.5757068 0.06255565 0.4509656 0.6914950
## 31     72         0   5 0.6191180 0.05969651 0.4974060 0.7275026
## 32     74         1   3 0.3278177 0.06778289 0.2106540 0.4712427
## 33     72         1   2 0.3145687 0.08415756 0.1759748 0.4965423
## 
## $p
##      sex occ  estimate         se       lcl       ucl
## 1 Female   7 0.8568973 0.03983628 0.7600623 0.9188251
## 2   Male   7 0.9240681 0.02511068 0.8578462 0.9608486

Okay, we get estimates for each measured weight now, this is not really useful (imagine we would measure weight more precisely and end up with many more parameters).

In this case, it is very useful to create a new data frame, and use the predict function on that data:

new_starling <- expand.grid(sex=c("Female","Male"),weight=60:80,Heat_wave1=0,
       Heat_wave2=1,Heat_wave3=1,Heat_wave4=0,Heat_wave5=0,Heat_wave6=0) 

# predict with the newdata option
prediction <- predict(m4,newdata=new_starling,se=TRUE)

# Instead of binary factor, use the actual name
prediction$Phi$Heat_wave = factor(prediction$Phi$Heat_wave, labels = c("No_Heatwave", "Heatwave"))

# plot data
ggplot(prediction$Phi, aes(weight, estimate, ymin = lcl, ymax = ucl)) +
  geom_errorbar(width = 0.2) +
  geom_point() +
  geom_line() +
  xlab("\nWeight") + ylab("Survival\n") + facet_grid(Heat_wave ~ .)

Exercise

Some exercises:

Exercise 6.1

The data set starlings contains one column (“infected”) that represents an infection with an incurable disease (i.e. a static variable comparable to sex). Formulate hypothesis associated with this infection status (write them down!) & try to build simple CMR models. Think about why infection could matter for capture probability or survival. There are some NAs in the data set, maybe try ?na-omit() and remove some of the observations without data from all analyses.Discuss the results in one sentence

Exercise 6.2

In our previous model (m3), we saw that sex & heatwave played an important role in individual survival. Formulate a model that additionally accounts for the habitat type (hint: * or :)

Exercise 6.3 (advanced)

This exercise will require some data-wrangling, the data set is not yet prepared: In the file locations, you will find measured temperatures for each year.

(Hint: You might need dplyr::left_join() and dplyr::rename() to get all data in the right format and bind the rows according to the habitat type.)

Test how temperature affects survival, and how sex affects detection probability. Then predict the survival probabilities for extreme weather scenarios (temperatures of 30,33,35,38,41,44 degree Celsius)

Session Info
Sys.time()
## [1] "2025-03-12 15:19:33 CET"
#git2r::repository() ## uncomment if you are using GitHub
sessionInfo()
## R version 4.3.3 (2024-02-29 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19045)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.utf8 
## [2] LC_CTYPE=English_United Kingdom.utf8   
## [3] LC_MONETARY=English_United Kingdom.utf8
## [4] LC_NUMERIC=C                           
## [5] LC_TIME=C                              
## 
## time zone: Europe/Berlin
## tzcode source: internal
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] d6berlin_1.0.0          tidyr_1.3.1             readr_2.1.5            
##  [4] ggplot2_3.5.0           rnaturalearthdata_1.0.0 devtools_2.4.5         
##  [7] usethis_2.2.3           tmap_3.3-4              sf_1.0-16              
## [10] skimr_2.1.5             marked_1.2.8            lme4_1.1-35.2          
## [13] Matrix_1.6-5            here_1.0.1             
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3  wk_0.9.1            rstudioapi_0.16.0  
##   [4] jsonlite_1.8.8      magrittr_2.0.3      farver_2.1.1       
##   [7] nloptr_2.0.3        rmarkdown_2.26      fs_1.6.3           
##  [10] ragg_1.3.0          vctrs_0.6.5         memoise_2.0.1      
##  [13] minqa_1.2.6         base64enc_0.1-3     terra_1.7-71       
##  [16] htmltools_0.5.8.1   leafsync_0.1.0      truncnorm_1.0-9    
##  [19] s2_1.1.6            raster_3.6-26       sass_0.4.9         
##  [22] pracma_2.4.4        KernSmooth_2.23-22  bslib_0.7.0        
##  [25] htmlwidgets_1.6.4   stars_0.6-5         cachem_1.0.8       
##  [28] TMB_1.9.11          mime_0.12           lifecycle_1.0.4    
##  [31] pkgconfig_2.0.3     optimx_2023-10.21   R6_2.5.1           
##  [34] fastmap_1.1.1       shiny_1.8.1.1       digest_0.6.35      
##  [37] numDeriv_2016.8-1.1 colorspace_2.1-0    rprojroot_2.0.4    
##  [40] prismatic_1.1.1     leafem_0.2.3        pkgload_1.3.4      
##  [43] textshaping_0.3.7   crosstalk_1.2.1     labeling_0.4.3     
##  [46] lwgeom_0.2-14       fansi_1.0.6         abind_1.4-5        
##  [49] compiler_4.3.3      proxy_0.4-27        remotes_2.5.0      
##  [52] bit64_4.0.5         withr_3.0.0         DBI_1.2.2          
##  [55] highr_0.10          pkgbuild_1.4.4      MASS_7.3-60.0.1    
##  [58] sessioninfo_1.2.2   tmaptools_3.1-1     leaflet_2.2.2      
##  [61] classInt_0.4-10     tools_4.3.3         units_0.8-5        
##  [64] httpuv_1.6.15       R2admb_0.7.16.3     glue_1.7.0         
##  [67] nlme_3.1-164        promises_1.3.0      grid_4.3.3         
##  [70] generics_0.1.3      gtable_0.3.4        tzdb_0.4.0         
##  [73] class_7.3-22        rmdformats_1.0.4    data.table_1.15.4  
##  [76] hms_1.1.3           sp_2.1-3            xml2_1.3.6         
##  [79] utf8_1.2.4          pillar_1.9.0        stringr_1.5.1      
##  [82] vroom_1.6.5         ggspatial_1.1.9     later_1.3.2        
##  [85] splines_4.3.3       dplyr_1.1.4         lattice_0.22-6     
##  [88] bit_4.0.5           tidyselect_1.2.1    miniUI_0.1.1.1     
##  [91] knitr_1.46          bookdown_0.38       svglite_2.1.3      
##  [94] xfun_0.43           expm_0.999-9        stringi_1.8.3      
##  [97] yaml_2.3.8          boot_1.3-30         kableExtra_1.4.0   
## [100] evaluate_0.23       codetools_0.2-20    tibble_3.2.1       
## [103] cli_3.6.2           xtable_1.8-4        systemfonts_1.0.6  
## [106] repr_1.1.7          munsell_0.5.1       jquerylib_0.1.4    
## [109] dichromat_2.0-0.1   Rcpp_1.0.12         coda_0.19-4.1      
## [112] png_0.1-8           XML_3.99-0.16.1     ellipsis_0.3.2     
## [115] profvis_0.3.8       urlchecker_1.0.1    viridisLite_0.4.2  
## [118] scales_1.3.0        e1071_1.7-14        crayon_1.5.2       
## [121] purrr_1.0.2         rlang_1.1.3